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We investigate the mechanical unfolding of the tenth type III 
domain from fibronectin, FnllliQ, botli at constant force and at 
constant pulling velocity, by all-atom Monte Carlo simulations. 
We observe both apparent two-state unfolding and several 
unfolding pathways involving one of three major, mutually 
exclusive intermediate states. All the three major intermediates 
lack two of seven native fS-strands, and share a quite similar 
extension. The unfolding behavior is found to depend strongly on 
the pulling conditions. In particular, we observe large variations 
in the relative frequencies of occurrence for the intermediates. 
At low constant force or low constant velocity, all the three 
major intermediates occur with a significant frequency. At high 
constant force or high constant velocity, one of them, with the 
N- and C-terminal P -strands detached, dominates over the other 
two. Using the extended Jarzynski equality, we also estimate 
the equilibrium free-energy landscape, calculated as a function 
of chain extension. The application of a constant pulling force 
leads to a free-energy profile with three major local minima. 
Two of these correspond to the native and fully unfolded states, 
respectively, whereas the third one can be associated with the 
major unfolding intermediates. 
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Fibronectin is a giant multimodular protein that exists in both sol- 
uble (dimeric) and fibrillar forms. In its fibrillar form, it plays a 
central role in cell adhesion to the extracellular matrix. Increas- 
ing evidence indicates that mechanical forces exerted by cells are 
a key player in initiation of fibronectin fibrillogenesis as well as 
in modulation of cell-fibronectin adhesion, and thus may regulate 
the form and function of fibronectin [1,2]. 

Each fibronectin monomer contains more than 20 modules of 
three types, called Fnl-III. The most common type is Fnlll, with 
~90 amino acids and a /?-sandwich fold. Two critical sites for 
the interaction between cells and fibronectin are the RGD motif 
Arg78-Gly79-Asp80 [3] on the tenth Fnlll module, Fnlllio, and 
a synergistic site [4] on the ninth Fnlll module, which bind to 
cell-surface integrins. In the native structure of Fnllljo, shown 
in Fig. 1, the RGD motif is found on the loop connecting the C- 
terminal /?-strands F and G. It has been suggested that a stretching 
force can change the distance between these two binding sites suf- 
ficiently to affect the cell-adhesion properties, without deforming 
the sites themselves [2]. Force could also influence the adhesion 
properties by causing full or partial unfolding of the Fnllljo mod- 
ule, and thereby deformation of the RGD motif [5]. Whether or 
not mechanical unfolding of fibronectin modules occurs in vivo is 
controversial. It is known that cell-generated force can extend 
fibronectin fibrils to several times their unstretched length [6]. 
There are experiments indicating that this extensibility is due to 
changes in quaternary structure rather than unfolding [7], while 
other experiments indicate that the extensibility originates from 
force-induced unfolding of Fnlll modules [8,9]. Also worth not- 
ing is that the Fnllljo module is capable of fast refolding [10]. 

Atomic force microscopy (AFM) experiments have provided 
important insights into the mechanical properties of Fnlll mod- 
ules [11-13]. Interestingly, it was found that, although thermo- 
dynamically very stable [14], the cell-binding module Fnllljo is 
mechanically one of the least stable Fnlll modules [11]. Further, 
it was shown that the force-induced unfolding of Fnllljo often 
occurs through intermediate states [12]. While apparent one-step 
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Figure 1: Schematic illustration of the NMR-derived [53] native structure 
of Fnllljo (Protein Data Bank ID Ittf). Its seven /^-strands are labeled 
A-G in sequence order 

events were seen as well, a majority of the unfolding events had 
a clear two-step character [12]. A recent AFM study of pH de- 
pendence [13] suggests that electrostatic contributions are less 
important for the mechanical stability of Fnllljo than previously 
thought. 

Several groups have used computer simulations to investigate 
the force-induced unfolding of Fnlllig [5, 15-20]. An early study 
predicted the occurrence of intermediate states [15]. In these 
simulations, two unfolding pathways were seen, both proceeding 
through partially unfolded intermediate states. Both intermedi- 
ates lacked two of the seven native /J-strands. The missing strands 
were A and B in one case, and A and G in the other (for strand 
labels, see Fig. 1). A more recent study reached somewhat differ- 
ent conclusions [17]. This study found three different pathways, 
only one of which involved a partially unfolded intermediate state, 
with strands A and B detached. The experiments [12] are con- 
sistent with the existence of the two different intermediates seen 
in the early simulations [15], but do not permit an unambiguous 
identification of the states. When comparing the experiments with 
these simulations, it should be kept in mind that the forces studied 
in the simulations were larger than those studied experimentally. 

Here we use an implicit-water all-atom model with a simple 
and computationally convenient energy function [21,22] to inves- 
tigate how the response of Fnllljo to a stretching force depends 
on the pulling strength. We study the unfolding behavior both at 
constant force and at constant pulling velocity. Some previous 
studies were carried out using explicit-solvent models [5, 17, 18]. 
These models might capture important details that our implicit- 
solvent model ignores, like weakening of specific hydrogen bonds 
through interactions between water molecules and the protein 
backbone [23]. The advantage of our model is computational con- 
venience. The relative simplicity of the model makes it possible 



for us to generate a large set of unfolding events, which is impor- 
tant when studying a system with multiple unfolding pathways. 

Our analysis of the generated unfolding trajectories consists of 
two parts. The first part aims at characterizing the major unfold- 
ing pathways and unfolding intermediates. In the second part, 
we use the extended Jarzynski equality (EJE) [24—26] to estimate 
the equilibrium free-energy landscape, calculated as a function 
of end-to-end distance. This analysis extends previous work on 
simplified protein models [27-30] to an atomic-level model. This 
level of detail may be needed to facilitate comparisons with fu- 
ture EJE reconstructions based on experimental data. Indeed, two 
applications of this method to experimental protein data were re- 
cently reported [31,32]. 

Model and Methods 

Model 

We use an all-atom model with implicit water, torsional degrees 
of freedom, and a simplified energy function [21,22]. The energy 
function 

E = ^loc + ^ev + ^hb + ^hp (1) 

is composed of four terms. The term E\q(^ is local in sequence 
and represents an electrostatic interaction between adjacent pep- 
tide units along the chain. The other three terms are non-local in 
sequence. The excluded volume term iigv is a 1/r'^ repulsion 
between pairs of atoms. Ehi, represents two kinds of hydrogen 
bonds: backbone-backbone bonds and bonds between charged 
side chains and the backbone. The last term E^p represents an 
effective hydrophobic attraction between nonpolar side chains. It 
is a simple pairwise additive potential based on the degree of con- 
tact between two nonpolar side chains. The precise form of the 
different interaction terms and the numerical values of all geome- 
try parameters can be found elsewhere [21,22]. 

It has been shown that this model, despite its simplicity, pro- 
vides a good description of the structure and folding thermody- 
namics of several peptides with different native geometries [22]. 
For the significantly larger protein FnllliQ, it is computationally 
infeasible to verify that the native structure is the global free- 
energy minimum. However, in order to study unfolding, it is suf- 
ficient that the native state is a local free-energy minimum. In 
our model, with unchanged parameters [21,22], the native state 
of Fnllljo, indeed, is a long-lived state conesponding to a free- 
energy minimum, as will be seen below. 

The same model has previously been used to study both me- 
chanical and thermal unfolding of ubiquitin [33, 34]. In agree- 
ment with AFM experiments [35], it was found that ubiquitin, 
like Fnlllio, displays a mechanical unfolding intermediate far 
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from the native state, and this intermediate was characterized [33]. 
The picture emerging from this study [33] was subsequently sup- 
ported by ubiquitin simulations based on completely different 
models [36-38]. 

The energy function E of Eq.[T] describes an unstretched pro- 
tein. In our calculations, the protein is pulled either by a constant 
force or with a constant velocity. In the first case, constant forces 
— F and F act on the N and C termini, respectively. The full en- 
ergy function is then given by 

£tot = E-F R (2) 

where R is the vector from the N to the C terminus. In the 
constant-velocity simulations, the pulling of the protein is mod- 
eled using a harmonic potential in the end-to-end distance L = 
\R\ whose minimum Lo (?) varies linearly with Monte Carlo (MC) 
time t. With this external potential, the full, time-dependent en- 
ergy function becomes 

£tot(0 ^E + ^[L„{t)-L]^ = E+''-[LQ + vt-Lf (3) 

where A; is a spring constant, v is the pulling velocity, and Lq is 
the initial equilibrium position of the spring. The spring constant, 
corresponding to the cantilever stiffness in AFM experiments, is 
set to k = 37pN/nm. The experimental Fnllljo study of [12] 
reported a typical spring constant of ~ 50 pN/nm. 

Simulation methods 

Using MC dynamics, we study six constant force magnitudes F 
(50 pN, 80 pN, 100 pN, 120 pN, 150 pN and 192 pN) and four con- 
stant pulling velocities v (0.03 fm/MC step, 0.05 fm/ MC step, 
0. 10 fm/MC step and 1 .0 fm/MC step), at a temperature of 288 K. 
Three different types of MC updates are used: (i) Biased Gaus- 
sian Steps [39], BGS, which are semi-local updates of backbone 
angles; (ii) single-variable Metropolis updates of side-chain an- 
gles; and (iii) small rigid-body rotations of the whole chain. The 
BGS move simultaneously updates up to eight consecutive back- 
bone angles, in a manner that keeps the chain ends approximately 
fixed. In the constant-velocity simulations, the time-dependent 
parameter L„{t) is changed after every attempted MC step. 

As a starting point for our simulations, we use a model ap- 
proximation of the experimental Fnlllio structure (backbone root- 
mean-square deviation 0.2 nm), obtained by simulated anneal- 
ing. All simulations are started from this initial structure, with 
different random number seeds. However, in the constant-velocity 
runs, the system is first thermalized in the potential E + k{L() — 
L)^ /2 for 10^ MC steps (Lq — 3.8 nm), before the actual simu- 
lation is started at r = 0. The thermalization is a prerequisite for 
the Jarzynski analysis (see below). 



The constant-force simulations are run for a fixed time, which 
depends on the force magnitude. There are runs in which the pro- 
tein remains folded over the whole time interval studied. The 
constant-velocity simulations are run until the spring has been 
pulled a distance of vt = 35 nm. At this point, the protein is 
always unfolded. 

Our simulations are carried out using the program package 
PROFASI [40], which is a C-l~l- implementation of this model. 
3D structures are drawn with PyMOL [41]. 

Analysis of pathways and intermediates 

To characterize pathways and intermediates, we study the evo- 
lution of the native secondary-structure elements along the un- 
folding trajectories. For this purpose, during the course of the 
simulations, all native hydrogen bonds connecting two /?-strands 
(see Fig. 1) are monitored. A bond is defined as present if the 
energy of that bond is lower than a cutoff {—lAk^T). Using this 
data, we can describe a configuration by which pairs of ^-strands 
are formed. A jS-strand pair is said to be formed if more than a 
fraction 0.3 of its native hydrogen bonds are present. Whether 
individual /?-strands are present or absent is determined based on 
which /?-strand pairs the conformation contains. 

The characterization of intermediate states requires slightly dif- 
ferent procedures in the respective cases of constant force and 
constant velocity. For constant force, a histogram of the end-to- 
end distance L, covering the interval 3 nm < L < 27 nm, is made 
for each unfolding trajectory. Each peak in the histogram corre- 
sponds to a metastable state along the unfolding pathway. To re- 
duce noise the histogram is smoothed with a sliding L window of 
0.3 nm. Peaks higher than a given cutoff are identified. Two peaks 
that are close to each other are only considered separate states if 
the values between them drop below half the height of the small- 
est peak. The position of an intermediate, Lj, is calculated as a 
weighted mean over the corresponding peak. The area under the 
peak provides, in principle, a measure, rj, of the life time of the 
state. However, due to statistical difficulties, we do not measure 
average life times of intermediate states. 

In the constant- velocity runs, the unraveling of the native state 
or an intermediate state is associated with a rupture event, at 
which a large drop in force occurs. To ascertain that we register 
actual rupture events and not fluctuations due to thermal noise, the 
force versus time curves are smoothed with a sliding time window 
of Tw = 0.3 nm/o, where v is the pulling velocity. Rupture events 
are identified as drops in force that are larger than 25 pN within a 
time less than Tw. The point of highest force just before the drop 
defines the rupture force, Fj, and the end-to-end distance, Lj, of 
the corresponding state. Only rupture events with a time separa- 
tion of at least 2Tw are considered separate events. The rupture 
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force Fi is a stability measure statistically easier to estimate than 
the life time q at constant force. 

For a peak with a given Lj, to decide which yS-strands the cor- 
responding state contains, we consider all stored configurations 
with|L — Lil < O.lnm. All /?-strand pairs occurring at least once 
in these configurations are considered formed in the state. With 
this prescription, it happens that separate peaks from a single run 
exhibit the same set of yfi-strand pairs. Distinguishing between 
different substates with the same secondary-structure elements is 
beyond the scope of the present work. Such peaks are counted as 
a single state, with set to the weighted average position of the 
merged peaks. 

Jarzynski analysis 

From the constant-velocity trajectories, we estimate the equilib- 
rium free-energy landscape Gq{L), as a function of the end-to-end 
distance L, for the unstretched protein by using EJE [24-26,42]. 
For our system, this identity takes the form 

{S(L-L{C,))e-^''''^'^), (4) 

where is Boltzmann's constant, T the temperature, and Ct 
stands for the configuration of the system at time t. In this equa- 
tion, (. . denotes an average over trajectories Cr, < r < f, 
with starting points Cq drawn from the Boltzmann distribution 
corresponding to EtodO) (see Eq.O. The quantity Wt is the work 
done on the system along a trajectory and is given by 

to[L„(r)-L(Cr)]dr = j F AL„ (5) 

As discussed in [26,42], combining Eq. |4]with the weighted 
histogram method [43], one finds that the optimal estimate of the 
target function Go(L) is given by 

Go(L) = 

_^ ^j^r E,('?a-L(G))e-'^'/fa^),/(e-^'/^-Br) " 

(6) 

up to an additive constant. As in an experimental situation, 
for each unfolding trajectory, we sample the end-to-end distance 
L(Ct) and the work Wt at discrete time intervals A:Ar, with 
k = Q, . . . ,n and nAr = t. The sums appearing in Eq. |6]thus 
run over these discrete times. 

Let Lujin and Lmax be the minimal and maximal end-to-end 
distances, respectively, observed in the unfolding trajectories. We 
divide the interval [L^iini ^max] into sub-intervals of length AL 



pulling force or velocity 


runs 


MC steps/lO* 


50 pN 


98 


1000 


80 pN 


100 


1000 


lOOpN 


100 


250 


120pN 


200 


100 


150 pN 


340 


50 


192 pN 


600 


30 


0.03 fm/MC .step 


100 


1 167 


0.05 fm/MC step 


99 


700 


0.10 fm/MC .step 


99 


350 


1.0 fm/MC step 


200 


35 



Table 1: Number of mns and the length of each ran, in number of ele- 
mentary MC steps, at the different pulling conditions studied. 

and evaluate Gq(Li) for each Lj — Lniin + + 1/2)AL by 
exploiting Eq. |6] The two averages appearing in this equation 
are estimated as 0,(L(C;)) exp(— /k^iT) and exp{—Wt /k^T), 
where the bar indicates an average over trajectories and the func- 
tion 9j(x) is defined as 9i{x) = 1 if \x — L,| < AL/2 and 
6i{x) — otherwise. Further details on the scheme used can 
be found in [42]. 

Results 

Description of the calculated unfolding traces 

We study the mechanical unfolding of Fnlllio for six constant 
forces and four constant velocities. Table 1 shows the number of 
runs and the length of each run in these ten cases. At low force or 
low velocity, it takes longer for the protein to unfold, which makes 
it necessary to use longer and computationally more expensive 
trajectories. 

Fig. 2 shows the time evolution of the end-to-end distance L in 
a representative set of runs at constant force (lOOpN). Typically 
each trajectory starts with a long waiting phase with L ~ 5 nm, 
where the molecule stays close to the native conformation. In 
this phase, the relative orientation of the two yS-sheets (see Fig. 1) 
might change, but all native /J-strands remain unbroken. The 
waiting phase is followed by a sudden increase in L. This step 
typically leads either directly to the completely unfolded state 
with L ~ 30 nm or, more commonly, to an intermediate state 
at L ~ 12-16 nm. The intermediate is in turn unfolded in another 
abrupt step that leads to the completely stretched state. In a small 
fraction of the trajectories, depending on force, the protein is still 
in the native state or an intermediate state when the simulation 
stops. Intermediates outside the range 12-16 nm are unusual but 
occur in some runs. For example, a relatively long-lived interme- 
diate at 21 nm can be seen in one of the mns in Fig. 2. 
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Figure 2: MC time evolution of tlie end-to-end distance in 42 indepen- 
dent simulations with a constant pulling force of 100 pN. The three most 
frequent intermediates lack different pairs of native yS-strands: AG, FG, or 
AB. Trajectories in which these states occur are labeled green, blue and 
red, respectively. Apparent two-state events are colored black. 



Fig. 3 shows samples of unfolding traces at constant velocity 
(0.05 fm/MC step). Here force is plotted against end-to-end dis- 
tance. As in the constant-force runs, there are two main events in 
most trajectories. First, the native state is pulled until it ruptures 
at L ~ 5 nm. The chain is then elongated without much resistance 
until it, in most cases, reaches an intermediate at L ~ 12-16 nm. 
Here the force increases until there is a second rupture event. Af- 
ter that, the molecule is free to elongate towards the fully unfolded 
state with L ~ 30 nm. Some trajectories have force peaks at other 
L. An unusually large peak of this kind can be seen at 22 nm 
in Fig. 3. Inspection of the corresponding structure reveals that it 
contains a three-stranded yS-sheet composed of the native CD hair- 
pin and a non-native strand. This sheet is pulled longitudinally, 
which explains why the stability is high. Another feature worth 
noting in Fig. 3 is that the pulling velocity is sufficiently small to 
permit the force to drop to small values between the peaks. 

There are several similarities between the unfolding events 
seen at constant force and at constant velocity. In most trajec- 
tories, there are stable intermediates, and the unfolding from both 
the native and intermediate states is abrupt. Also, the vast majority 
of the observed intermediates have a similar end-to-end distance, 
in the range 12-16 nm. It should be noticed that experiments typ- 
ically measure contour-length differences rather than end-to-end 
distances. Below we analyze contour-length differences between 
the native state and our calculated intermediates, which turn out 
to be in good agreement with experimental data. 

The trajectories can be divided into three categories: apparent 
two-state unfolding, unfolding through intermediate states, and 



Figure 3: Force versus end-to-end distance in 55 independent simulations 
with a constant pulling velocity of 0.05 fm/MC step. Noise has been fil- 
tered out using a sliding time window of 6 • 10^ MC steps. The color 
coding is the same as in Fig. 2, with the addition of a new category for a 
few trajectories not belonging to any of the four categories in that figure. 
These trajectories are colored grey. 

trajectories in which no unfolding takes place. Table 2 shows 
the relative frequencies of these groups at the different pulling 
conditions. The number of trajectories in which the protein re- 
mains folded throughout the run obviously depends on the trajec- 
tory length. More interesting to analyze is the ratio between the 
two kinds of unfolding, with or without intermediate states. In the 
constant-force runs, this ratio depends strongly on the magnitude 
of the applied force; unfolding through intermediates dominates 
at the lowest force, but is less common than apparent two-state un- 
folding at the highest force. In the constant-velocity runs, unfold- 
ing through intermediates is much more probable than apparent 
two-state unfolding at all the velocities studied. 

Identifying pathways and intermediates 

The fact that most observed intermediates fall in the relatively 
naiTow L interval of 12-16 nm does not mean that they are struc- 
turally similar. Actually, the data in Figs. 2 and 3 clearly indicate 
that these intermediates can be divided into three groups with sim- 
ilar but not identical end-to-end distances. The /?-strand analysis 
(see Model and Methods) reveals that these three groups corre- 
spond to the detachment of different pairs of /^-strands, namely 
A and G, A and B, or F and G. The prevalence of these partic- 
ular intermediate states is not surprising, given the native topol- 
ogy. When pulling the native structure of Fnllljo, the interior 
of the molecule is shielded from force by the N- and C-terminal 
/S-strands, A and G. Consequently, in 95 % or more of our runs. 
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pulling force or velocity 


n = 2 


n > 3 


no unfolding 


50 pN 


0.01 


0.79 


0.20 


80 pN 


0.21 


0.79 





100 pN 


0.23 


0.77 





120 pN 


0.24 


0.76 





150pN 


0.29 


0.72 


<0.01 


192 pN 


0.54 


0.46 





0.03 fm/MC step 


0.04 


0.96 





0.05 fm/MC step 


0.07 


0.93 





0.10 fm/MC step 


0.03 


0.97 





1.0 fm/MC step 





1.0 






Table 2: The fractions of trajectories in which unfolding occurs either 
in an apparent two-state manner (labeled « = 2) or through intermediate 
states (labeled n > 3). "No unfolding" refers to the fraction of trajectories 
in which the protein remains folded throughout the run (with L < 8 nm). 



either strand A or G is the first to detach, for all the pulling con- 
ditions studied. Most commonly, this detachment is followed by 
a release of the other strand of the two. But, when A (G) is de- 
tached, B (F) is also exposed to force. We thus have three main 
options for detaching two strands, AG, AB or FG, which actually 
correspond to the three major intermediates we observe. 

Intermediates outside the interval 12-16 nm also occur in our 
simulations. When applied to the intermediates with L < 12nm, 
the yS-strand analysis identifies two states with one strand de- 
tached, A or G. The intermediates with L > 16mn are scattered 
in L and correspond to rare states with more than two strands 
detached. The intermediate at 21 nm seen in one of the runs in 
Fig. 2 lacks, for example, four strands (A, B, F and G). How- 
ever, in these relatively unstructured states with more than two 
strands detached, the remaining strands are often disrupted, which 
makes the binary classification of strands as either present or ab- 
sent somewhat ambiguous. Moreover, it is not uncommon that 
these large-L intermediates contain some non-native secondary 
structure. In what follows, we therefore focus on the five states 
seen with only one or two strands detached. 

For convenience, the intermediates will be referred to by which 
strands are detached. The intermediate with strands A and B un- 
folded will thus be labeled AB, etc. Tables 3 and 4 show basic 
properties of the A, G, AB, AG and FG intermediates, as observed 
at constant force and constant velocity, respectively. 

From Tables 3 and 4, several observations can be made. A 
first one is that the average end-to-end distance, Lj, of a given 
state increases slightly with increasing force. More importantly, it 
can be seen that the relative frequencies with which the different 
intermediates occur depend strongly on the pulling conditions. At 
high force or high velocity, the AG intermediate stands out as 
the by far most common one. By contrast, at low force or low 



velocity, there is no single dominant state. In fact, at F = 50 pN 
as well as at u = 0.03 fm/MC step, all the five states occur with a 
significant frequency. 

Table 4 also shows the average rupture force, Fj, of the dif- 
ferent states, at the different pulling velocities. Although the 
data are somewhat noisy, there is a clear tendency that Fj, for 
a given state, slowly increases with increasing pulling velocity, 
which is in line with the expected logarithmic v dependence [44]. 
Comparing the different states, we find that those with only one 
strand detached (A and G) are markedly weaker than those with 
two strands detached (AG, AB and FG), as will be further dis- 
cussed below. Most force-resistant is the AB intermediate. This 
state occurs much less frequently than the AG intermediate, es- 
pecially at high velocity, but is harder to break once formed. 
Compared to experimental data, our Fj values for the interme- 
diates are somewhat large. The experiments found a relatively 
wide distribution of unfolding forces centered at 40-50 pN [12], 
which is a factor two or more lower than what we find for the 
AG, AB and FG intermediates. Our results for the unfolding 
force of the native state are consistent with experimental data. 
For the native state, the experiments found unfolding forces of 
75 ± 20pN [11] and 90 ± 20pN [12]. Our corresponding results 
are 88 ± 2 pN, 99 ± 2 pN and 1 14 ± 3 pN at d = 0.03 fm/MC step, 
V = 0.05 fm/MC step and v =0.10 fm/MC step, respectively. 

The AG, AB and FG intermediates do not only require a sig- 
nificant rupture force in our constant-velocity runs, but are also 
long-lived in our constant-force simulations. In fact, in many 
runs, the system is still in one of these states when the simulation 
ends, which means that their average life times, unfortunately, are 
too long to be determined from the present set of simulations. 
Nevertheless, there is a clear trend that the AB intermediate is 
more long-lived than the other two, which in turn have similar 
life times. The relative life times of these states in the constant- 
force runs are thus fully consistent with their force-resistance in 
the constant-velocity runs. 

At high constant force, we see a single dominant intermedi- 
ate, the AG state, but also a large fraction of events without any 
detectable intermediate. Interestingly, it turns out that the same 
two strands, A and G, are almost always the first to break in the 
apparent two-state events as well. Table 5 shows the fraction of 
all trajectories, with or without intermediates, in which A and G 
are the first two strands to break, at the different forces studied. 
At 192 pN, this fraction is as large as 98 %. Although the time 
spent in the state with strands A and G detached varies from run 
to run, there is thus an essentially deterministic component in the 
simulated events at high force. 

The unfolding behavior at low force or velocity is, by contrast, 
complex, with several possible pathways. Fig. 4 illustrates the 
relations between observed pathways at the lowest pulling veloc- 
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50pN 80pN lOOpN 120pN 150pN 192pN 



state 


/ 




/ 




/ 




/ 


il 


/ 


il 


/ 




AG 


0.46 


13.9 


0.49 


14.3 


0.65 


14.3 


0.69 


14.5 


0.69 


14.6 


0.45 


14.7 


AB 


0.35 


12.4 


0.14 


12.9 


0.09 


13.1 


0.03 


13.2 


<0.01 




<0.01 




FG 


0.15 


14.8 


0.13 


15.2 


0.03 


15.5 


0.03 


15.7 


<0.01 




<0.01 




G 


0.19 


11.1 


0.04 


11.8 






















A 


0.13 


6.7 




























Table 3: Frequency / and average extension Li (in nm) of intermediate states in the constant-force simulations. The label of a state indicates which 
^-strands are detached, that is the state AG lacks strands A and G, etc. The frequency / is the number of runs in which a given state was seen, divided 
by the total number of runs in which unfolding occurred. The statistical uncertainties on are about 0.1 nm or smaller. " — " indicates not applicable. 
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FG 


0.15 
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15.3 
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162 
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0.04 
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0.05 


54 


10.5 


0.08 
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0.20 


46 


9.9 


0.06 


67 10.3 
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0.06 


43 


6.2 


0.07 


53 


7.2 


0.09 


57 


6.9 


0.03 


81 7.2 



Table 4: Frequency /, average rupture force Fi (in pN) and average extension L\ (in nm) of intermediate states in the constant-velocity simulations. 
The statistical uncertainties are 10-20 % on F\, about 0.1 nm or smaller on Lj for AG and AB, and about 0.5 nm on Lj for FG, G and A. 



first pair 50 pN 80 pN lOOpN 120pN 150 pN 192 pN 

A&G 050 069 87 0.935 0.973 0.980 

A&B 0.35 0.15 0.09 0.025 0.006 0.007 

F&G 015 016 O04 O040 0.021 0.013 



termediate is preceded by another intermediate. 

The unfolding pattern illustrated in Fig. 4 can be partly under- 
stood by counting native hydrogen bonds. The numbers of hy- 
drogen bonds connecting the strand pairs AB, BE, CF and FG are 
npji = 7, /1BE = 5, ncF = 8 and ii^q = 6, respectively. In our 
as well as in a previous study [17], two hydrogen bonds near the 
C terminus break early in some cases, which reduces the number 
of FG bonds to n-pG — 4. The transition frequencies seen in Fig. 4 
match well with the ordering )ibe ~ «pcj < ;7ab < '^CF- The 
first branch point in Fig. 4 is the native state. Transitions from this 
state to the G state, N^G, are more common than N^A transi- 
tions, in line with the relation npQ < (iab- The second layer of 
branch points is the A and G states. That transitions G— > AG are 
more common than G— >GF and that A— >AG and A^AB have 
similar frequencies, match well with the relations /t^B < "CF 
and npQ ~ ;ibE' respectively. Finally, there are fewer hydrogen 
bonds connecting the AB hairpin to the rest of the native structure 
than what is the case for the FG hairpin, «be < "CF' which may 
explain why the AB hairpin, unlike the FG hairpin, detaches as 
one unit in some runs. 

Another feature seen from Fig. 4 is that the remaining native- 
like core rotates during the course of the unfolding process. The 
orientation of the core is crucial, because a strand is much more 
easily released if it can be unzipped one hydrogen bond at a time, 
rather than by longitudinal pulling. The detachment of the first 
strand leads, irrespective of whether it is A or G, to an arrange- 



Table 5: The fractions of all unfolding events in which the first two strands 
to break are A & G, F & G, and A&B, respectively, at different constant 
forces. The first pair to break was always one of these three. 



ity, 0.03 fm/MC step. The main unfolding path begins with the 
detachment of strand G, followed by the formation of the AG in- 
termediate, through the detachment of A. There are also runs in 
which the same intermediate occurs but A and G detach in the 
opposite order. Note that for the majority of the trajectories the 
boxes A and G in Fig. 4 only indicate passage through these states, 
not the formation of an intermediate state. In a few events, it is 
impossible to say which strand breaks first. In these events, the 
initial step is either that the hairpin AB detaches as one unit, or 
that strands A and G are unzipped simultaneously. Detachment 
of the FG hairpin in one chunk does not occur in the set of tra- 
jectories analyzed for Fig. 4. Finally, we note that in the few 
trajectories where G occurs as an intermediate, the FG interme- 
diate is always visited as well, but never AG. Similarly, the few 
trajectories where the A intermediate occurs also contain the AG 
intermediate, but not AB. We find no example where the AB in- 



7 




Figure 4: Illustration of the diversity of unfolding pathways in the 100 constant-velocity unfolding simulations at t) = 0.03 fm/MC step. The numbers 
indicate how many of the trajectories follow a certain path. The boxes illustrate important structures along the pathways and boxes with dark rims 
correspond to the most long-lived states. Dark circles mark branch points. Most trajectories pass through G or A, but only a fraction spend a significant 
amount of time there (see Table 4). The line directly from G to U corresponds to events that either have no intermediate at all or only have intermediates 
other than the main three. The direct Unes N— >AB and N— > AG describe events that do not clearly pass through A or G and examples of structures seen 
in those events are illustrated by the unboxed cartoons next to the lines. 
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ment such that two strands are favorably positioned for unzipping, 
which explains why the intermediates with only A or G detached 
have a low force-resistance (see Tables 3 and 4). The AG, AB and 
FG intermediates, on the other hand, have cores that are pulled 
longitudinally, which makes them more resistant. Also worth not- 
ing is that the core of the AG intermediate is flipped 180°, which 
is not the case for the AB and FG intermediates. 

The end-to-end distance of the intermediates cannot be directly 
compared with experimental data. The experiments measured 
[12] contour-length differences rather than L, through worm-like 
chain (WLC) [45] fits to constant-velocity data. Using data at our 
lowest pulling velocity (0.03 fm/MC step), we now mimic this 
procedure. For each force peak, we determine a contour length 
Lc by fitting the WLC expression 



[4(1 



i i + ^ 

■ z/Lc)2 4 Lc 



(7) 



to data. Here if denotes the persistence length and z is the elonga- 
tion, defined as z = L — L-j^, where is the end-to-end distance 
of the native state. Following [12], we use a fixed persistence 
length of C = 0.4 nm. 

After each rupture peak follows a region where the force is rel- 
atively low. Here it sometimes happens that the newly released 
chain segment forms a -helical structures, indicating that our sys- 
tem is not perfectly described by the simple WLC model. Nev- 
ertheless, the WLC model provides a quite good description of 
our unfolding traces, as illustrated by Fig. 5. The figure shows a 
typical unfolding trajectory with three force peaks, corresponding 
to the native (N), intermediate (I) and unfolded (U) states, respec- 
tively. From the fitted Lc values, the contour-length differences 
ALc(N I), ALc(I U) and ALc(N U) can be calcu- 
lated. 

Fig. 6 shows a histogram of ALc(N — > I), based on our 100 
trajectories for v = 0.03 fm/MC step. For a small fraction of the 
force peaks, a WLC fit is not possible; e.g., the A state cannot be 
analyzed due to its closeness to the native state. All intermediates 
analyzed have a ALc(N — > I) in the range 6-27 nm. They are di- 
vided into five groups: AB, AG, FG, G and "other". Most of those 
in the category "other" have five strands detached (CDEFG or 
ABEFG) and a ALc(N — > I) larger than 21 nm. These intermedi- 
ates were not identified in the experimental study [12], which did 
not report any ALc(N — > I) values larger than 18 nm. These high- 
L intermediates mainly occur as a second intermediate, following 
one of the main intermediates, which perhaps explains why they 
were not observed in the experiments. The few remaining inter- 
mediates in the category "other" are all of the same kind, ABG, 
but show a large variation in ALc(N — > I), from 10 to 19 nm. 
The small values correspond to states where strand B actually is 
attached to the structured core, but through non-native hydrogen 
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Figure 5: WLC fits (Eq. |7) to a typical force-extension curve at d = 
0.03 fm/MC step. The aiTows indicate contour-length differences ex- 
tracted from the fits: ALc(N I), ALc(I U) and ALc(N U). 
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Figure 6: Histogram of the contour-length difference ALc(N — > I), ob- 
tained by WLC fits (Eq.|7) to our data for v = 0.03 fm/MC step. A total 
of 121 force peaks corresponding to intermediate states are analyzed. The 
intermediates ai'e divided into five groups: AB, AG, FG, G and "other". 
The experimental ALc(N — > I) distribution, from [12], is also indicated. 



bonds. 

The three major peaks in the ALc(N — > I) histogram (Fig. 6) 
correspond to the AG, AB and FG intermediates. Although simi- 
lar in size, these states give rise to well separated peaks, the means 
of which differ in a statistically significant way (see Table 6). For 
comparison, Fig. 6 also shows the experimental ALc(N I) dis- 
tribution [12]. The statistical uncertainties appear to be larger in 
the experiments, because the distribution has a single broad peak 
extending from 6 to 18 nm. All our ALc(N — > I) data for the 
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state ALc(N -> I) (nm) 



AG 


12.1 ±0.3 


AB 


10.1 ±0.1 


FG 


13.4 ±0.3 


G 


8.2 ±0.9 



Table 6: The average contour-length dilference ALc(N— > I) for dif- 
ferent intermediates, as obtained by WLC fits (Eq. |7) to our data for 
V = 0.03 fm/MC step. 



AB, AG, FG and G intermediates fall within this region. The oc- 
currence of these four intermediates is thus consistent with the 
experimental ALc(N I) distribution. The highest pealc, corre- 
sponding to the AG intermediate, is located near the center of the 
experimental distribution. 

Transitions from the native state directly to the unfolded state 
do not occur in the trajectories analyzed for Fig. 6. For the 
contour-length difference between these two states, we find a 
value of ALc(N — > U) = 30.9 ±0.1 nm, in perfect agreement 
with experimental data [12]. 



Estimating the free-energy profile 

We now present the free-energy profile obtained by applying 
Eqs.|4}{6]to the constant-velocity trajectories. The number of tra- 
jectories analyzed can be seen in Table 1. Fig. 7 shows the free- 
energy landscape at zero force, Gq(L), against the end-to-end dis- 
tance L, as obtained using different velocities v. We observe a 
collapse of the curves in the region of small-to-moderate L. Fur- 
thermore, the range of L where the curves superimpose, expands 
as decreases. As discussed in [28-30, 42], the collapse of the 
reconstructed free-energy curves, as the manipulation rate is de- 
creased, is a clear signature of the reliability of the evaluated free- 
energy landscape. Given our computational resources, we are not 
able to further decrease the velocity v, and for L > 15 nm there is 
still a difference of ~ 40 k^T between the two curves correspond- 
ing to the lowest velocities. The best estimate we currently have 
for G(){L) is the curve obtained with v = 0.03 fm/MC step. This 
curve will be used in the following analysis. 

Let us consider the case where a constant force F is ap- 
plied to the chain ends. The free energy then becomes G(L) = 
Gq(L) — F ■ L. The tilted free-energy landscape G{L) is espe- 
cially interesting for small forces for which the unfolding process 
is too slow to be studied through direct simulation. 

Fig. 8 shows our calculated G(L) for four external forces in 
the range 10-50 pN. At F = 10 pN, the state with minimum 
free energy is still the native one, and no additional local min- 
ima have appeared. At F = 25 pN, the situation has changed. For 
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Figure 7: Free-energy landscape GoiL) calculated as a function of the 
end-to-end distance L, using data at different pulling velocities d (given 
in fm/MC step). In the calculations, L is discretized with a bin size of 
AL = 0.4nm for v = l.Ofm/MC step and AL = 0.2 nm for all the other 
velocities (see Model and Methods). 



20 < F < 60 pN, we find that G(L) exhibits three major min- 
ima: the native minimum and two other minima, one of which 
corresponds to the fully unfolded state. The fully unfolded state 
takes over as the global minimum beyond F = Fc ^ 22 pN. The 
statistical uncertainty on the force at which this happens, Fc, is 
large, due to uncertainties on G(L) for large L, as will be further 
discussed below. For F = 25 pN, the positions of the three major 
minima are 4.3 nm, 12 nm and 25 nm. As F increases, the min- 
ima move slightly toward larger L; for F = 50 pN, their positions 
are 4.6 nm, 14 nm and 29 nm. The first two minima become in- 
creasingly shallow with increasing F. For F > 60 pN, the only 
surviving minimum is the third one, corresponding to the com- 
pletely unfolded state. 

These results have to be compared with the analysis above, 
which showed that the system, on its way from the native to 
the fully unfolded state, often spends a significant amount of 
time in some partially unfolded intermediate state with L around 
12-16 nm. These intermediates should correspond to local free- 
energy minima along different unfolding pathways, but may or 
may not correspond to local minima of the global free energy 
G(L), which is based on an average over the full conformational 
space. As we just saw, it turns out that G(L) actually exhibits a 
minimum around 12-16 nm, where the most common intermedi- 
ates are found. It is worth noting that above ~ 25 pN this mini- 
mum gets weaker with increasing force. This trend is in agree- 
ment with the results shown in Table 2: the fraction of apparent 
two-state events, without any detectable intermediate, increases 
with increasing force. 
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Figure 8: Tilted free-energy landscape G(L) = Go(i-) — F ■ L for four 
different forces F. The unperturbed landscape Gq(L) corresponds to the 
curve shown in Fig. 7 for u = 0.03 fm/MC step. The minima of G (L) are 
discussed in the text. 

For F — 25 pN and F = 35 pN, a fourth minimum can also be 
seen in Fig. 8, close to the native state. Its position is 6 nm. This 
minimum is weak and has already disappeared for F = 50 pN. It 
corresponds to a state in which the two native /?-sheets are slightly 
shifted relative to each other and aligned along the direction of the 
force, with all strands essentially intact. The appearance of this 
minimum is in good agreement with the results of Gao et al. [17]. 
In their unfolding trajectories, Gao et al. saw two early plateaus 
with small L, which in terms of our G(L) should correspond to 
the native minimum and this L ^ 6 nm minimum. In our model, 
the L V 6 nm minimum represents a non-obligatory intermedi- 
ate state; in many unfolding events, especially at high force, the 
molecule does not pass this state. 

Finally, Fig. 9 illustrates a more detailed analysis of the native 
minimum of G(L), for 20 pN < F < 60 pN. In this force range, 
we find that the first barrier is always located at L = 5.0 nm, 
whereas the position of the native minimum varies with force (see 
inset of Fig. 9). Hence, the distance between the native mini- 
mum and the barrier, x^, depends on the applied force, as ex- 
pected [46-49]. Fig. 9 shows the force-dependence of the bar- 
rier height, AG(F). The solid line is a linear fit with slope 
= 0.4 nm, which describes the data quite well in the force 
range 25-56 pN. At lower force, the force-dependence is steeper; 
a linear fit to the data at low force gives a slope of = 0.8 nm 
(dashed line). Using this latter fit to extrapolate to zero force, 
we obtain a barrier estimate of AG(0) ^ 5kcal/mol. Due to 
the existence of the non-obligatory L ^ 6 nm intermediate, it is 
unclear how to relate this one-dimensional free-energy barrier to 
unfolding rates. Experimentally, barriers are indirectly probed. 
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Figure 9: Free-energy hairier AG, separating the native state from ex- 
tended conformations, as a function of the pulling force F. The solid line 
is a Hnear fit to the data for forces F > 25 pN, while the dashed line refers 
to a linear fit to the data in the interval 15 pN < F < 30 pN. The inset 
shows the free energy G(L) in the vicinity of the native state for three 
values of the force. The vertical dashed line indicates the position of the 
banier. 

using unfolding kinetics. For Fnlllig, experiments found a zero- 
force barrier of 22.2kcal/mol [11], using kinetics. For the un- 
folding length, an experimental value of = 0.38 nm was re- 
ported [11], based on data in the force range 50-1 15 pN. Our 
result Xu = 0.4 nm obtained using the overlapping force range 
25-56 pN, is in good agreement with this value. 

Discussion 

By AFM experiments, Li et al. [12] showed that Fnlllio unfolds 
through intermediates when stretched by an external force. AFM 
data for the wild-type sequence and some engineered mutants 
were consistent with the existence of two distinct unfolding path- 
ways with different intermediates, one being the AB state with 
strands A and B detached and the other being either the AG or the 
FG state [12]. This conclusion is in broad agreement with sim- 
ulation results obtained by Paci and Karplus [15] and by Gao et 
al. [17]. 

Comparing our results with these previous simulations, one 
finds both differences and similarities. In our simulations, three 
major intermediates are observed: AB, which was seen by Paci 
and Karplus as well as by Gao et al.; AG, also seen by Paci and 
Karplus; and FG, which was not observed in previous studies. 
The most force-resistant intermediate is AB in our as well as in 
previous studies. Frequencies of occurrence of the intermediates 
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are difficult to compare because the previous studies were based 
on fewer trajectories. Nevertheless, one may note that the most 
common intermediate in our simulations, AG, is one of two in- 
termediates seen by Paci and Karplus, and corresponds to one of 
three pathways observed by Gao et al. A and G often being the 
first two strands to break is also in agreement with the simulation 
results of Klimov and Thirumalai [16], who studied several dif- 
ferent proteins using a simplified model. Unlike us, these authors 
found a definite unfolding order for the yS-strands. The first strand 
to break was G, followed by A. 

A key issue in our study is how the unfolding pathway depends 
on the pulling strength. This question was addressed by Gao et 
al. [17]. Based on a simple analytical model rather than simula- 
tions, it was argued that there is a single unfolding pathway at low 
force and multiple unfolding pathways at high force. Our results 
show the opposite trend. At our lowest force, 50 pN, we observe 
several different unfolding pathways, and all the three major inter- 
mediates occur with a significant frequency. At our highest force, 
192 pN, unfolding occurs either in one step or through one par- 
ticular intermediate, the AG state. Moreover, at 192 pN, the same 
two strands, A and G, are almost always the first to break in the 
apparent one-step events as well. Hence, at our highest force, we 
find that the unfolding behavior has an essentially deterministic 
component. The trend that the unfolding pathway becomes more 
deterministic with increasing force can probably be attributed to 
a reduced relative importance of random thermal fluctuations. 

There is a point of disagreement between our results and ex- 
perimental data, which is that the rupture forces of the three ma- 
jor intermediates are higher in our constant-velocity simulations 
than they were in the experiments [12]. Although the statisti- 
cal uncertainties are non-negligible and the pulling conditions are 
not identical (e.g., we consider a single Fnllljo module, while 
the experiments studied multimodular constructs), we do not see 
any plausible explanation of this discrepancy. It thus seems that 
our model overestimates the rupture force of these intermediates. 
Our calculated rupture force for the native state is consistent with 
experimental data (see above). To make sure that this agree- 
ment is not accidental, we also measured the rupture force of 
the native state for three other domains, namely Fnlllj2. Fnllli3 
and the titin 127 domain. ATM experiments (at 0.6 /zm/s) found 
that these domains differ in force-resistance, following the order 
FnIIIi3(~90pN) < Fnllli2(~ 120pN) < I27(~200pN) [11]. 
For each of these domains, we carried out a set of 60 unfolding 
simulations, at a constant velocity of 0. lOfm/MC step. The aver- 
age rupture forces were 108 ± 4pN for Fnllli3, 135 ± 4pN for 
Fnlll|2. and 159 ± 6pN for 127, which is in reasonable agree- 
ment with experimental data. In particular, our model correctly 
predicts that the force-resistance of the native state decreases as 
follows: 127 > FnIIl]2> Fnlllj3~ Fnllljo- Similar findings have 



been reported for another model [18]. 

Throughout the paper, times have been given in MC steps. In 
order to roughly estimate what one MC step corresponds to in 
physical units, we use the average unfolding time of the native 
state, which is ~ 4 ■ 10** MC steps at our lowest force, 50 pN. As- 
suming that the force-dependence of the unfolding rate is given 
by k{F) = /toexp(Fxu//tBr) [50] with .Yu = 0.38nm [11], 
this unfolding time corresponds to a zero-force unfolding rate of 
fcg ~ 1/(4 • lO'^MC steps). Setting this quantity equal to its ex- 
perimental value, fcg = 0.02 s~' [11], gives the relation that one 
MC step corresponds to 1 • 10"'' s. Using this relation to trans- 
late our pulling velocities into physical units, one finds, for exam- 
ple, that 0.05 fm/MC step corresponds to 0.05 /(m/s. This estimate 
suggests that the effective pulling velocities in our simulations are 
comparable to or lower than the typical pulling velocity in the ex- 
periments [12], which was 0.4/;m/s. That the effective pulling 
velocity is low in our simulations is supported by the observation 
made earlier that the force drops to very small values between the 
rupture peaks. 

The force range studied in our simulations is comparable to 
that studied in AFM experiments [11-13]. The exact forces act- 
ing on fibronectin under physiological conditions are not known, 
but might be considerably smaller. For comparison, it was esti- 
mated that physiologically relevant forces for the muscle protein 
titin are ~4pN per I-band molecule [51]. For so small forces, 
the unfolding of Fnllljo occurs too slowly in the model to per- 
mit direct simulation. Therefore, we cannot characterize unfold- 
ing pathways and possible intermediates for these forces. On the 
other hand, we have an estimate of the free-energy profile G(L) 
for arbitrary force, which can be used, in particular, to estimate 
the force Fc, beyond which the fully extended state has minimum 
free energy. Using our best estimate of G(L), one finds an of 
22 pN (see above). Now, Fc depends on the behavior of G(L) for 
large L, where the uncertainties are large and not easy to accu- 
rately estimate. As a test, we therefore repeated the same analysis 
using the Ising-like model of [28, 29, 38] (unpublished results), 
which gave us the estimate Fc ~ 20 pN, in quite good agree- 
ment with the value found above (22 pN). Together, we take these 
results to indicate that Fc > 15 pN, which might be large com- 
pared to physiologically relevant forces (see above). For stretch- 
ing forces F significantly smaller than Fc , the statistical weight of 
the fully stretched state is small. To estimate the suppression, let 
Ln and Ls be the end-to-end distances of the native and stretched 
states. The free energies of these states at force F can be written 
as Gn = G^ — (F — Fc)Ln and Gs = G^ — (F — Fc)Ls, where 
G^ and G^ are the free energies at Fc. Assuming G^ — G^, 
15 pN and Ls — Lj^ ~ 20 nm, one finds, for example, that 
Gs - Gn > 25A:Br for F < 10 pN. Our estimate Fc > 15pN 
thus indicates that unfolding of Fnllljo to its fully stretched state 
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is a rare event for stretching forces F < 10 pN. The major in- 
termediates are also suppressed compared to the native state for 
F < 10 pN (see Fig. 8). However, our results indicate that the 
major intermediates are more likely to be observed than the fully 
stretched state for these forces. 

By extrapolating from experimental data at zero force, the force 
at which the native and fully stretched states have equal free en- 
ergy has been estimated to be 3.5-5 pN for an average Fnlll do- 
main [52]. Our results suggest that the native state remains ther- 
modynamically dominant at so small forces. 

The reconstructed free energies Gq(L) and G(L) are thermo- 
dynamical potentials describing the equilibrium behavior of the 
system in the absence and presence of an external force F, re- 
spectively. On the other hand, the long-lived intermediate states 
observed during the unfolding of the molecule are a clear signa- 
ture of out-of-equilibrium behavior They indicate an arrest of 
the unfolding kinetics, typically in the L range 12-16 nm, on the 
way from the old (native) equilibrium state to the new (fully un- 
folded) equilibrium state. The calculated (equilibrium) landscape 
G{L) (see Fig. 8) is to some extent able to describe this out-of- 
equilibrium behavior. For 20 < F < 60 pN, this function exhibits 
three major minima corresponding to the folded state, the most 
common intermediates, and the fully unfolded state, respectively. 
However, since G(L) describes the system in terms of a single co- 
ordinate L and "hides" the microscopic configuration, one cannot 
extract the full details of individual unfolding pathways from this 
function. For example, one cannot, based on G(L), distinguish 
the AG, AB and FG intermediates, which have quite similar L. 

The height of the first free-energy barrier, AG, can be related 
to the unfolding length Xu, a parameter typically extracted from 
unfolding kinetics, assuming the linear relationship AG(F) = 
A Go — • i^u • The parameter measures the distance between the 
native state and the free-energy barrier, which generally depends 
on force. Our data for Xu indeed show a clear force-dependence 
(see inset of Fig. 9). However, over a quite large force interval, our 
Xu is almost constant and similar to its experimental value [12], 
which was based on an overlapping force interval. 

Conclusion 

We have used all-atom MC simulations to study the force-induced 
unfolding of the fibronectin module Fnlllio, and in particular how 
the unfolding pathway depends on the pulling conditions. Both at 
constant force and at constant pulling velocity, the same three ma- 
jor intermediates were seen, all with two native /?-strands missing: 
AG, AB or FG. Contour-length differences ALc(N ^ I) for these 
states were analyzed, through WLC fits to constant- velocity data. 
We found that the states, in principle, can be distinguished based 



on their ALc(N I) distributions, but the differences between 
the distributions are small compared to the resolution of existing 
experimental data. 

The unfolding behavior at constant force was examined in the 
range 50-192 pN. The following picture emerges from this analy- 
sis: 

1. At the lowest forces studied, several different unfolding 
pathways can be seen, and all the three major intermediates 
occur with a significant frequency. 

2. At the highest forces studied, the AB and FG intermediates 
are very rare. Unfolding occurs either in an apparent single 
step or through the AG intermediate. 

3. The unfolding behavior becomes more deterministic with 
increasing force. At 192 pN, the first strand pair to break 
is almost always A and G, also in apparent two-state events. 

The dependence on pulling velocity in the constant-velocity 
simulations was found to be somewhat less pronounced, com- 
pared to the force-dependence in the constant-force simulations. 
Nevertheless, some clear trends could be seen in this case as well. 
In particular, with increasing velocity, we found that the AG state 
becomes increasingly dominant among the intermediates. Our re- 
sults thus suggest that the AG state is the most important interme- 
diate both at high constant force and at high constant velocity. 

The response to weak pulling forces is expensive to simulate; 
our calculations, based on a relatively simple and computationally 
efficient model, extended down to 50 pN. The Jarzynski method 
for determining the free energy G(L) opens up a possibility to 
partially circumvent this problem. Our estimated G(L), which 
matches well with several direct observations from the simula- 
tions, indicates, in particular, that stretching forces below 10 pN 
only rarely unfold Fnllljo to its fully extended state. Although 
supported by calculations based on a different model, this con- 
clusion should be verified by further studies, because accurately 
determining G(L) for large L is a challenge. 
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